NV TensorRT RTX EP - initial commit #24456
Conversation
New EP - currently based on existing TensorRT EP but meant to be used on RTX GPUs with a lean version of TensorRT.
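For readers unfamiliar with how an application would opt into the new provider, the following is a minimal sketch (not taken from this PR). It assumes the EP is exposed through ONNX Runtime's generic string-based `AppendExecutionProvider` registration path under the name "NvTensorRtRtx"; the final provider key may differ.

```cpp
// Minimal usage sketch. Assumption: the EP is registered under the key
// "NvTensorRtRtx" via the generic string-based registration API; the actual
// provider name may differ in the merged EP.
#include <onnxruntime_cxx_api.h>

#include <string>
#include <unordered_map>

int main() {
  Ort::Env env{ORT_LOGGING_LEVEL_WARNING, "nv_trt_rtx_demo"};
  Ort::SessionOptions session_options;

  // Provider options would come from nv_provider_options.h; left empty here.
  std::unordered_map<std::string, std::string> provider_options;
  session_options.AppendExecutionProvider("NvTensorRtRtx", provider_options);

  // Nodes the EP cannot handle fall back to other registered EPs (CPU by default).
  Ort::Session session{env, ORT_TSTR("model.onnx"), session_options};
  return 0;
}
```

In practice an application would first check `Ort::GetAvailableProviders()` to confirm the provider is present in the build before appending it.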
Fixed: include/onnxruntime/core/providers/nv_tensorrt_rtx/nv_provider_options.h
Fixed: onnxruntime/core/providers/nv_tensorrt_rtx/nv_execution_provider_helper.cc
Fixed: onnxruntime/core/providers/nv_tensorrt_rtx/nv_execution_provider_info.cc
Fixed: onnxruntime/core/providers/nv_tensorrt_rtx/nv_provider_options_internal.h
Default to minimal CUDA compile
Unload the model once it is no longer needed. Bug: 5225623
Fixed: onnxruntime/core/providers/nv_tensorrt_rtx/nv_execution_provider_custom_ops.cc
Please address the lintrunner failure.
Fix memory paging issue seen with large models.
…ion_options, const OrtLogger& session_logger)
NV TensorRt Rtx Ep
Ishwar/nv tensorrt rtx ep
Add support for python bindings of NV TensorRT RTX EP
use setShapeValuesV2 with 64-bit types to support the latest TRT RTX builds.
use setShapeValuesV2
Also fix a couple of bugs to ensure the options are actually passed down.
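For context on the setShapeValuesV2 commits above, here is a hedged sketch of supplying shape-tensor bounds as 64-bit values on a TensorRT optimization profile. The input name and values are placeholders, and the exact signature should be verified against the TensorRT RTX headers in use.

```cpp
// Sketch only: shape-tensor bounds passed as int64_t via setShapeValuesV2
// (the older setShapeValues took int32_t). "shape_input" and the values are
// hypothetical placeholders.
#include <NvInfer.h>

#include <array>
#include <cstdint>

bool ConfigureShapeInput(nvinfer1::IOptimizationProfile* profile) {
  const char* input_name = "shape_input";
  std::array<int64_t, 2> min_vals{1, 1};
  std::array<int64_t, 2> opt_vals{4, 16};
  std::array<int64_t, 2> max_vals{8, 64};

  using nvinfer1::OptProfileSelector;
  return profile->setShapeValuesV2(input_name, OptProfileSelector::kMIN,
                                   min_vals.data(), static_cast<int32_t>(min_vals.size())) &&
         profile->setShapeValuesV2(input_name, OptProfileSelector::kOPT,
                                   opt_vals.data(), static_cast<int32_t>(opt_vals.size())) &&
         profile->setShapeValuesV2(input_name, OptProfileSelector::kMAX,
                                   max_vals.data(), static_cast<int32_t>(max_vals.size()));
}
```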
Clean up old APIs and options
fix formatting
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows GPU Doc Gen CI Pipeline, Windows ARM64 QNN CI Pipeline, Windows x64 QNN CI Pipeline
Azure Pipelines successfully started running 5 pipeline(s).
@chilo-ms please review.
Adds some testing infrastructure and removes lots of deprecated options
@chilo-ms We will happily take any guidance on how to run more tests using the NV EP to find remaining bugs or implementation gaps.
…ixes remove debug logging
fix compile error in test
We need to add a new CI pipeline for building and testing this new NV EP, which this PR hasn't done yet.
To add a new NV EP pipeline in GitHub Actions, please duplicate the TRT EP's pipeline first and then modify it accordingly.
/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU Doc Gen CI Pipeline, Windows x64 QNN CI Pipeline
Azure Pipelines successfully started running 5 pipeline(s).
The code in ORT that makes this new NV EP known to ORT looks good to me.
As the EP code and test/validation are not the focus of this PR, we can discuss them later.
Description
Adding a new EP based on the TensorRT EP. It will use a special version of TensorRT optimized for RTX GPUs. In the future we plan to make changes to the EP to streamline it further (e.g., remove the dependency on the CUDA EP completely).
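As an illustration of how provider options declared in nv_provider_options.h might reach the EP, here is a hedged sketch using the same string-based registration path as above; the provider key "NvTensorRtRtx" and the option name "device_id" are assumptions for illustration, not confirmed names from this PR.

```cpp
// Sketch only: passing provider options as string key/value pairs. The
// provider key "NvTensorRtRtx" and the option "device_id" are assumed;
// real option names would be defined in nv_provider_options.h and parsed
// in nv_execution_provider_info.cc.
#include <onnxruntime_cxx_api.h>

#include <string>
#include <unordered_map>

Ort::SessionOptions MakeSessionOptions() {
  Ort::SessionOptions session_options;
  std::unordered_map<std::string, std::string> provider_options{
      {"device_id", "0"},  // hypothetical option: which RTX GPU to use
  };
  session_options.AppendExecutionProvider("NvTensorRtRtx", provider_options);
  return session_options;
}
```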
Motivation and Context
The new TensorRT for RTX is going to have:
This effort is also targeting WCR ML workflows.